Efficient Summarization of URLs using CRC32 for Implementing URL Switching

Authors

  • Zornitza Genova Prodanoff
  • Kenneth J. Christensen
Abstract

We investigate methods of using CRC32 to compress Web URL strings and to share URL lists among servers, caches, and URL switches. Using trace-based evaluation, we compare our new CRC32 digesting method against existing Bloom filter and incremental CRC19 methods. Our CRC32 method requires fewer CPU resources, generates digests of equal or smaller size, achieves equal collision rates, and simplifies switching.
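The general idea of summarizing URLs as 32-bit CRC values can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes the standard CRC-32 provided by Python's `zlib` module, and the helper names (`url_digest`, `build_digest`, `pack_digest`, `maybe_present`) and the packed wire format are hypothetical choices made for this example.

```python
import struct
import zlib


def url_digest(url: str) -> int:
    """Map a URL string to an unsigned 32-bit CRC value (standard CRC-32 via zlib)."""
    return zlib.crc32(url.encode("utf-8")) & 0xFFFFFFFF


def build_digest(urls):
    """Summarize a list of URLs as a sorted list of unique 32-bit values."""
    return sorted({url_digest(u) for u in urls})


def pack_digest(digest):
    """Serialize the digest at 4 bytes per entry (hypothetical wire format)."""
    return struct.pack("<%dI" % len(digest), *digest)


def maybe_present(url: str, digest_set) -> bool:
    """Membership test; distinct URLs can collide, so True means 'probably present'."""
    return url_digest(url) in digest_set


# Hypothetical example URLs
urls = [
    "http://www.example.com/index.html",
    "http://www.example.com/images/logo.gif",
]
digest = build_digest(urls)
payload = pack_digest(digest)
print(len(payload), maybe_present(urls[0], set(digest)))
```

Each URL costs a fixed four bytes in the shared summary regardless of its length, and a lookup is a single integer comparison. The trade-off is that two different URLs may map to the same value, which is why the abstract reports collision rates.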

Similar resources

Feature-based Malicious URL and Attack Type Detection Using Multi-class Classification

Nowadays, malicious URLs are a common threat to businesses, social networks, net-banking, etc. Existing approaches have focused on binary detection, i.e., whether the URL is malicious or benign. Very little literature has focused on the detection of malicious URLs and their attack types. Hence, it becomes necessary to know the attack type and adopt an effective countermeasure. This pa...

Cluster-based Web Summarization

We propose a novel approach to abstractive Web summarization based on the observation that summaries for similar URLs tend to be similar in both content and structure. We leverage existing URL clusters and construct per-cluster word graphs that combine known summaries while abstracting out URL-specific attributes. The resulting topology, conditioned on URL features, allows us to cast the summar...

Performance Evaluation of URL Routing for Content Distribution Networks

As the World Wide Web continues to grow in size, content is being co-located throughout the world in Content Distribution Networks (CDNs). These CDNs need entirely new methods of distributing client requests. The idea of a URL router has been introduced and in this dissertation the performance of URL routing is addressed. A URL router that uses HTTP redirection to automatically forward requests...

In-memory URL Compression using AVL Tree

A common problem of large scale search engines and web spiders is how to handle a huge number of encountered URLs. Traditional search engines and web spiders use hard disk to store URLs without any compression. This results in slow performance and more space requirement. This paper describes a simple URL compression algorithm allowing efficient compression and decompression. The compression alg...

WebParF: A Web partitioning framework for Parallel Crawlers

With the ever-proliferating size and scale of the WWW [1], efficient ways of exploring content are of increasing importance. How can we efficiently retrieve information from it through crawling? And in this “era of tera” and multi-core processors, we ought to think of multi-threaded processes as a serving solution. So, even better, how can we improve the crawling performance by using parallel cr...

Publication date: 2002